Identification of foreign-accented French using data mining techniques

نویسندگان

  • Bianca Vieru-Dimulescu
  • Philippe Boula de Mareüil
  • Martine Adda-Decker
چکیده

The goal of this study is to automatically differentiate foreign accents in French (Arabic, English, German, Italian, Spanish and Portuguese). We took advantage of automatic alignment into phonemes of non-native French recordings to measure vowel formants, consonant duration and voicing, prosodic cues as well as pronunciation variants combining French and foreign acoustic units (e.g. a rolled ‘r’). We retained 62 features which we used for training a classifier to discriminate speakers’ foreign accents. We obtained over 50% correct identification by using a cross-validation method including unseen data (a few sentences from 36 speakers). The best automatically selected features are robust to corpus change and make sense with regard to linguistic knowledge.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Role of segmental and suprasegmental cues in the perception of Maghrebian-accented French

The general objective of this study is to clear up the relative importance of prosody in the identification of a foreign accent. The methodology we propose, based on the prosody transplantation paradigm, can be applied to different languages or language varieties. Here, it is applied to Magrhrebian-accented French: we wanted to study what is perceived when the segmental and suprasegmental chara...

متن کامل

Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms

In recent years, due to the expansion of financial institutions,as well as the popularity of the World Wide Weband e-commerce, a significant increase in the volume offinancial transactions observed. In addition to the increasein turnover, a huge increase in the number of fraud by user’sabnormality is resulting in billions of dollars in lossesover the world. T...

متن کامل

Identification of Fraud in Banking Data and Financial Institutions Using Classification Algorithms

In recent years, due to the expansion of financial institutions,as well as the popularity of the World Wide Weband e-commerce, a significant increase in the volume offinancial transactions observed. In addition to the increasein turnover, a huge increase in the number of fraud by user’sabnormality is resulting in billions of dollars in lossesover the world. T...

متن کامل

Foreign and regional accents in French. Characterisation and identification

This research focuses on the identification and characterisation of accents in French. For both foreign and regional accents, we started with perceptual identification experiments, we measured phonetic features which may characterise these accents using automatic phoneme alignment, and we ranked the most discriminating features by using classification techniques. The following features are perc...

متن کامل

Identification of the Patient Requirements Using Lean Six Sigma and Data Mining

Lean health care is one of new managing approaches putting the patient at the core of each change. Lean construction is based on visualization for understanding and prioritizing imporvments. By using only visualization techniques, so much important information could be missed. In order to prioritize and select improvements, it’s essential to integrate new analysis tools to achieve a good unders...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007